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We present an analytic method for calculating spectral densities of empirical covariance matrices 
for correlated data. In this approach the data is represented as a rectangular random matrix 
whose columns correspond to sampled states of the system. The method is applicable to a class of 
random matrices with radial measures including those with heavy (power-law) tails in the probability 
distribution. As an example we apply it to a multivariate Student distribution. 



I. INTRODUCTION 

Random Matrix Theory provides a useful tool for description of systems with many degrees of freedom. A large 
spectrum of p roblems in physics 0, telecommunication, information theory 0- 0- El 01 and quantitative finance 
[6l r7l la. la. IIOL fill tia. [l3j can be naturally formulated in terms of random matrices. 

In this paper we apply random matrix theory to calculate the eigenvalue density of the empirical covariance matrix. 
Statistical properties of this matrix play an important role in many empirical applications. More precisely, the problem 
which we shall discuss here can be generally formulated in the following way. Consider a statistical system with N 
correlated random variables. Imagine that we do not know a priori correlations between the variables and that we try 
to learn about them by sampling the system T times. Results of the sampling can be stored in a rectangular matrix X 
containing empirical data Xu, where the indices i = 1, . . . , N and t = 1, . . . T run over the set of random variables and 
measurements, respectively. If the measurements are uncorrelated in time the two-point correlation function reads: 

(Xi ltl Xi 2 t 2 ) = Ci 1 i 2 S tl t 2 - (1) 

where C is called correlation matrix or covariance matrix. For simplicity assume that (Xn) = 0. If one does not know 
C one can try to reconstruct it from the data X using the empirical covariance matrix: 



1 T 

c ij = 7jn XgXju (2) 



t=i 

which is a standard estimator of the correlation matrix. One can think of X as of an N x T random matrix chosen 
from the matrix ensemble with some prescribed probability measure P(X)DX. The empirical covariance matrix: 

c = ixX^ (3) 

depends thus on X. Here X r stands for the transpose of X. For the given random matrix X the eigenvalue density 
of the empirical matrix c is: 

1 - 

p(X,A)^-^ < 5(A-A,(c)), (4) 

i=l 

where Aj(c)'s denote eigenvalues of c. Averaging over all random matrices X: 

p(X) ee (p(X, A)} = I p(X, A) P(X) DX, (5) 

we can find the eigenvalue density of c which is representative for the whole ensemble of X. We are interested in 
how the eigenvalue spectrum of c is related to that of C 0,0,0- Clearly, as follows from QJ, the quality of the 
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information encoded in the empirical covariance matrix c depends on the number of samples or more precisely on the 
ratio r = N/T. Only in the limit T — > oo, that is for r — > 0, the empirical matrix c perfectly reproduces the real 
covariance matrix C. Recently a lot of effort has been made to understand the statistical relation between c and C 
for finite r. This relation plays an important role in the theory of portfolio selection where Xu are identified with 
normalized stocks' returns and C is the covariance matrix for inter-stock correlations. It is a common practice to 
reconstruct the covariance matrix from historical data using the estimator (J2J). Since the estimator is calculated for 
a finite historical sample it contains a statistical noise. The question is how to optimally clean the spectrum of the 
empirical matrix c from the noise in order to obtain a best quality estimate of the spectrum of the underlying exact 
covariance matrix C. One can consider a more general problem, where in addition to the correlations between the 
degrees of freedom (stocks) there are also temporal correlations between measurements [l7j : 

(X iltl X i2 t 2 ) — Ci 1 i 2 A tl t 2 , (6) 

given by an autocorrelation matrix A. If X is a Gaussian random matrix, or more precisely if the probability measure 
P(X)DX is Gaussian, then the problem is analytically solvable in the limit of large matrices [171,1181 Il9l l2cj | . One can 
derive then an exact relation between the eigenvalue spectrum of the empirical covariance matrix c and the spectra 
of the correlation matrices A and C. 

In this paper we present an analytic solution for a class of probability measures P(X)DX for which the marginal 
distributions of individual degrees of freedom have power law tails: p{Xu) ~ XZ ~ v which means that the cumulative 
distribution function falls like X~ t v . Such kind of distributions has been discussed previously 0, but, up to 
our knowledge, the spectral density of c remained unattainable analytically. The motivation to study such systems 
comes from the empirical observation that stocks' returns on financial markets undergo non-Gaussian fluctuations 
with power-law tails. The observed value of the power-law exponent i/«3 seems to be universal for a wide class of 
financial assets jU 13, ■ Random matrix ensembles with heavy tails have been recently considered for < v < 2 
using the concept of Levy stable distributions [26l W\ |2S| . Here we will present a method which extrapolates also to 
the case v > 2, being of particular interest for financial markets. 

We will study here a model which on the one hand preserves the structure of correlations JBJ and on the other 
hand has power-law tails in the marginal probability distributions for individual matrix elements. More generally, we 
will calculate the eigenvalue density of the empirical covariance matrix c © for random matrices X which have a 
probability distribution of the form: 

P f (X)DX = JV-\f(Tr X T C~ 1 XA- 1 )DX, (7) 

where DX = Ui.i=i dx 

it is a volume element. The normalization constant Af: 

TV = ^(DetCf/^DetA)^ 2 (8) 

and the parameter d = NT have been introduced for convenience. The function / is an arbitrary non-negative 
function such that P(X) is normalized: J P(X)DX = 1. 

In particular we will consider an ensemble of random matrices with the probability measure given by a multivariate 
Student distribution: 

PAX)BX = M^{ 1 + ^ TT X^XA" 1 ) 2 DX - (9) 



The two-point correlation function can be easily calculated for this measure: 

T 2 

v — 2 



(Xi ltl X i2 t 2 ) — ^Qiia^fe. (10) 



We see that for a 2 = v — 2 and for v > 2 the last equation takes the form ©. With this choice of a 2 the two-point 
function becomes independent on 1/, however the formula for the probability measure breaks down at v = 2 and 
cannot be extrapolated to the range < v < 2. An alternative and actually a more conventional choice is <j 2 = v 
which extrapolates easily to this range. In this case one has to remember that for v > 2 the exact covariance matrix 
is given by C, where C is the matrix in Eq. JHJ with a 2 = v. We will stick to this choice in the remaining part 
of the paper. 

The marginal probability distribution for a matrix element Xu can be obtained by integrating out all others degrees 
of freedom from the probability measure P(X)DX. One can see that for the Student probability measure JHJ the 
marginal distributions of individual elements have by construction power-law tails. For example if C is diagonal 



C = Diag(C 2 , . . . , Cjj) and A = It then the marginal probability distributions can be found exactly for each element 
of the matrix X: 

r( u+1 ) f X 2 \ 

"'< x "' = r(|bfe( 1 + 4) ' 

The distributions pi fall like ~ X^ 1-1 ' for large Xn with amplitudes which depend on the index i and are independent 
of t. If one thinks of a stock market, this means that stocks' returns have the same tail exponent but different tail 
amplitudes. The independence of t means that the distributions pi(Xa) are stationary. More generally, for any C and 
for A which is translationally invariant A tl t 2 = A(|ti— £2)) the marginal distributions of entries Xa can be shown to 
have power-law tails with the same exponent v for all Xa and tail coefficients which depend on i and are independent 
of t, exactly expected from stocks' returns on a financial market. 

The main purpose of this paper is to calculate the spectral density of the empirical covariance matrix c for the 
Student distribution (JSJ . The method is similar to the one presented in [2^, |3(1 0, 113, HI] for a square Hermitian 
matrix. It consists in an observation that every quantity averaged over the probability distribution having the form l(7)l 
can be first averaged over [d— 1) "angular" variables and then of a "radial" variable. This shall be shortly presented 
in sections II and III. In the section IV the main equation for the eigenvalue density of c for the radial ensemble 
(J7J) with an arbitrary radial profile / shall be given. The section V contains results for the Student distribution © 
including some special cases. 



II. RADIAL MEASURES 
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= 0C0 T 


A 


+ A 


= Q T AQ 


X 


4 X 


= 0XQ r 



The radial measure JjJ depends on one scalar function / = f(x 2 ) of a real positive argument. In this section we 
shall develop a formalism to calculate the eigenvalue spectrum p f (A) of the empirical covariance matrix J3J for such 
radial ensembles. The calculation can be simplified by noticing that the dependence of p/(A) on the matrices C and 
A actually reduces to a dependence on their spectra. This follows from an observation that for a radial measure (0 
the integral JHJ defining the eigenvalue density is invariant under simultaneous transformations: 

(12) 

where 0, Q are orthogonal matrices of size N x N and T x T, respectively. Choosing the orthogonal transformations 
and Q in such a way that C and A become diagonal: C = Diag(C 2 , . . . , Cjy), A = Diag(A 2 , . . . , A^) with all Cj's 
and AtS being positive, we see that p/(A) depends on the matrices C and A indeed only through their eigenvalues. 
Therefore, for convenience we shall assume that C and A are diagonal from the very beginning. 

The radial form of the measure allows one to determine the dependence of the eigenvalue density p/(A) on the 
radial profile f{x 2 ). Intuitively, the reason for that stems from the fact that one can do the integration for the 
radial ensembles Q in two steps: the first step is a sort of angular integration which is done for fixed x and thus 
is independent of the radial profile f(x 2 ), and the second one is an integration over x. A short inspection of the 
formula (JJJ) tells us that fixed x corresponds to fixed trace: Tr X r C _1 XA _1 , and thus that we should first perform 
the integration over the fixed trace ensemble. We shall follow this intuition below. 

Let us define a matrix x = C~2XA~2. Since we assumed that A and C are diagonal, A 1 / 2 and C 1//2 are also 
diagonal with elements being square roots of those for A and C. The elements of x are: 

xu 55 (i3) 

They can be viewed as components Xj, j — 1, ...,d of a <i-dimensional Euclidean vector, where the index j is 
constructed from i and t. The length of this vector is: 



d NT 
i=l t=l 



■'" " y =22z2 x ^= Tr xTx = Tr X r C" 1 XA- 1 , (14) 



and thus the fixed trace matrices X are mapped onto a c?-dimensional sphere of the given radius x. It is convenient 
to parameterize the d-dimensional vector x using spherical coordinates x = xu, where uj 2 = Tr u> T u> = 1. We can 



also use these coordinates to represent the matrix X: 

X = C^xAi = xC?u>A? = xfl(u), 
n(w) = C'wAi, (15) 

where the definition of the matrix fi(u>) is equivalent to f2j t = Ci-Atti;^. While a; gives a point on a unit sphere in 
(i-dimensional space, gives a radial projection of this point on a d-dimensional ellipsoid of fixed trace: 

Tr fTC 1 = 1. (16) 



III. ANGULAR INTEGRATION 



We are now prepared to do the integration over the angular variables Da;. In the spherical coordinates l)15|l the 
radial measure JJJ assumes a very simple form: 

P/(X)DX = w- d/2 f(x 2 )x d - 1 dx Dw. (17) 

The normalization factor A/" -1 from Eq. Q cancels out. The spherical coordinates X = a;f2(u;) allow us to write the 
formula for p/(A) in the form: 



p f {\)=ir- d ^ 2 J p(X,X) P f (X) DX = ^- d/2 J Dw| p (xSl(w), A) f{x 2 )x d - 1 da 



(18) 



Although the integration over the angular and the radial part cannot be entirely separated, we can partially decouple 
x from Jl in the first argument of p(xQ(uj),\). It follows from Q that the rescaling X — > aX by a constant gives 
the relation: 

p(oX, A) = a- 2 p(X, a- 2 \). (19) 
This observation can be used to rewrite the equation l|18|l in a more convenient form: 

o/W-^/n-jf />(«>(«), ^) ^f^*) (20) 

where r(z) is the Euler gamma function and 

p m (X) = ^-J p(fl(u) t X)Tkj. (21) 

Here denotes the hyper-surface area of ei-dimensional sphere of radius one: Sd = 27r rf / 2 /r(|). As we shall see 
below the last expression is an eigenvalue distribution of the empirical covariance matrix for the fixed trace ensemble 
defined as an ensemble of matrices X such that Tr X r C _1 XA _1 = 1. From the structure of the equation (|20|) it is 
clear that if p*(A) is known then p/(A) can be easily calculated for any radial profile just by doing one-dimensional 
integral. So the question which we face now is how to determine p*(A) for arbitrary C and A. We will do this by 
a trick. Instead of calculating p*(A) directly from Eq. H21|l. we will express p* (X) by the corresponding eigenvalue 
density Pg(A) for a Gaussian ensemble, whose form is known analytically ^lUg. Let us follow this strategy in the 
next section. 

IV. FIXED TRACE ENSEMBLE AND GAUSSIAN ENSEMBLE 

The probability measure for the fixed trace ensemble is defined as 

T(-) 

P*(X)DX = -j^A 5 (Tr (X T C" 1 XA- 1 ) - l) DX. (22) 
In the spherical coordinates uj the formula reads: 

P*(X)DX = -J- 5{x 2 - 1) x d ~ x dx Dw. 
Sd 



One can easily check that the integration p*(A) = J p(X, A)P*(X)DX indeed gives (|21|l . It is also worth noticing that 
the normalization condition for P»(X) is fulfilled. Consider now a Gaussian ensemble: 



P G (X)DX = W^/cOTr X T C- 1 XA- 1 )DX, (23) 

where 

fc{x 2 ) = ^- 2 ^ x \ (24) 

for which the spectrum p q(X ) is known or more precisely it can be easily computed numerically in the thermodynamical 
limit N,T -> oo [13, IS Ml- On the other hand as we learned in the previous section, the density of eigenvalues of 
the empirical covariance matrix c can be found applying Eq. (|2U|I to the Gaussian radial profile <|24[1 : 

nl-d/2 roc / A \ 

"><*> = -tWL »,{^)^e-i'^ (25, 
Changing the integration variable to y: x 2 = dy 2 and rescaling the spectrum p G by d: A = 4 we eventually obtain: 



dp G {d\) 



A\ 1 



y 2 J y 2 



y a i e 2 v 

r(f) 



dy. (26) 



One can easily check that the formula in the square brackets tends to the Dirac delta for large matrices because then 
d goes to infinity: 

ol-d/2 id/2 

lim f y^e-^ =5(y-l), 

d^co r(|) 

and thus the integrand in Eq. I|26|) gets localized around the value y = 1. Therefore for large d we can make the 
following substitution: 

p*(A) =d PG {d\). (27) 

Inserting it into Eq. 120|) and changing the integration variable to y = ^ we finally obtain a central equation of this 
paper: 

The meaning of this formula is the following: for any random matrix ensemble with a radial measure J7J the eigenvalue 
density function p/(A) is given by a one-dimensional integral of a combination of the corresponding Gaussian spectrum 
Pg(A) and the radial profile f(x). The equation holds in the thermodynamic limit: d = NT — > oo and r = N/T = 
const. Since in this limit we are able to calculate the spectrum pg(A) for arbitrarily chosen A, C, the formula 128() 
gives us a powerful tool for computing spectra of various distributions. In the next section we shall apply it to the 
multivariate Student ensemble JHl- 



V. MULTIVARIATE STUDENT ENSEMBLE 



The radial profile for the Student ensemble © is: 



f( X 2 ) ee U{x 2 ) = ^Qj, ( 1 + - ) . (29) 



r(|)^/ 2 V v 

We have chosen here the standard convention a 2 = v since we would like to calculate the spectrum /?„(A) also for 
v < 2 (see the discussion at the end of the first section). Inserting H29|l into the equation Ij28(l : 



m /dW2 r(^) ... , r°° f / d\\ 



y 2 ay, 



and taking the limit d — > oo: 

hm - j y 2 A 2 1H = 7: e^y 2 A 2 , 

we see that the expression for p v (X) simplifies to an expression which is independent of d: 

PuW = f^j © V/ ^"*~ 1 Pg(2/) e"*M dy. (30) 

The formula 1(30(1 works for all ^ > 0. From the last equation we can infer the behavior of p v {\) for large A. The 
function pc(y) has a compact support 0,H3, therefore for large A the exponential can be approximated well by 
1. The function p v {X) has thus a long tail: 

frW srr 'f(E)(2) 7 Poiv)v**U, (31) 

where the integral does not depend on A. The exponent — l>/2 — 1 in the above power-law depends on the index v 
of the original Student distribution. The change from the power v to the power v/2 comes about because c is a 
quadratic combination of X. 

The power-law tail in the eigenvalue distribution (|31(l does not disappear in the limit of large matrices contrary 
to the power-law tails in the eigenvalue distribution for an ensemble of matrices whose elements are independently 
distributed random numbers. For such matrices, for v > 2, the density p(X) falls into the Gaussian universality class 
and yields the Wishart spectrum |36| . One should remember that the multivariate Student distribution © discussed 
here does not describe independent degrees of freedom even for A = ly and C = Ijv, in which case the degrees of 
freedom are "uncorrelated" but not independent. 

We have learned that the spectrum is unbounded from above. Let us now examine the lower limit of the spectrum. 
Rewriting Eq. (|3ll|) in the form: 

2,/V 2 r°° 

Pu{\) = ^ J o PG (2sA) e— x^ 2 dx, (32) 

we see that as long as A > the function p v (X) is positive since pg{%) is positive on a finite support. Thus the function 
Pi, (A) vanishes only at A = and it is positive for any A > 0. Contrary to the classical Wishart distribution for the 
Gaussian measure, the spectrum <|30[1 spreads over the whole real positive semi-axis. On the other hand, taking the 
limit v — > oo of Eq. I|32|l and using the formula: 

I"* %^x v/2 e-" x = S(x-l/2), (33) 

we obtain p„^oo(A) = Pg(A) as expected, because in this limit the radial profile f v {x 2 ) given by Eq. 1)29(1 for the 
Student distribution reduces to the Gaussian one lf2"l)) . 



VI. EXAMPLES 



Let us first consider the case without correlations: C = Ijv and A = It- The spectrum of the empirical covariance 
for the Gaussian ensemble is given by the Wishart distribution: 



Pg(X) = 7 -^v'(A + -A)(A-A_), 
where A± = (1 ± y^) 2 BHG3- The corresponding spectrum (|30|) for the Student ensemble is then: 

^ (A) = ^f^T^^^r V(A + -J/)(j/-A-) e --^ 2 - 1 d 2/ . (34) 

The integral over dy can be easily computed numerically. Results of this computation for different values of v are 
shown in Fig. ^ For increasing v the spectrum p v {\) tends to the Wishart distribution but even for very large v it 
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FIG. 1: Spectra of the covariance matrix c for the Student distribution © with C = Ijv and A = It, r = N/T = 0.1, for 
v — 1/2, 2, 5, 20 and 100 (thin lines from solid to dotted), calculated using the formula l|M4[l and compared to the uncorrelated 
Wishart (thick line). One sees that for v — > oo the spectra tend to the Wishart distribution. 




FIG. 2: Spectra of the empirical covariance matrix c calculated from Eq. il'-i II with r — 1/3, compared to experimental data 
(stair lines) obtained by the Monte Carlo generation of finite matrices TV = 50, T = 150. Inset: the left part of the same 
distributions, points represent experimental data. 

has a tail which touches A = as follows from Eq. I|32|l . In Fig. |2]we have plotted p„(A) for v — 0.5, 1 and 2 and 
compared them to experimental results obtained by the Monte-Carlo generation of random matrices drawn from the 
corresponding ensemble with the probability measure © for which eigenvalue densities were computed by numerical 
diagonalization. The agreement is perfect. Actually it is even better than for the Gaussian case for the same size N. 

As a second example we consider the case when C has two distinct eigenvalues Ai and A2 with degeneracies: 
(1 — p)N for Ai and pN for A2, where < p < 1. Such a covariance matrix can be used to model the simplest 
effect of sectorization on a stock exchange. For example if all diagonal elements of the matrix C are equal 1 and all 
off-diagonal are equal po (0 < po < 1) the model can be used to mimic a collective behavior on the market |{j llCj. 
In this case Ai = 1 — po has a degeneracy N — 1 and A2 = 1 + (N — l)po is non-degenerated, hence p = 1/N. The 
eigenvector corresponding to the larger eigenvalue A2 can be thought of as describing the correlations of all stocks. 
For our purposes it is however more convenient to set Ai = 1 and A2 = p and p being an arbitrary number between 




FIG. 3: Spectra p u (X) for C having two distinct eigenvalues: 1 and p in proportion (1 — p) : p, calculated from Eq. (1M0> with 
pG given by formula 13511 . with r = 1/10, p = 1/2 and p = 5. Thick solid line corresponds to the Gaussian case v — > oo while 
thin lines to v = 5, 20, 100. These lines are compared to Monte-Carlo results obtained by the generation and diagonalization 
of finite matrices with N = 40, T = 400 (gray lines), which lie almost exactly on top of them and can be hardly seen by an 
unarmed eye. 



zero and one. The corresponding Wishart spectrum pg(A) can be obtained by solving equations given by a conformal 
map |2f)j |. The resulting spectrum has the form: 



Pg(X) = - 



Im 



M{Z{\)) 



where 



M(Z) 

Z(X) 
E 



1 ~P VP 
Z-l Z-n' 

a (1 - iy/3)(3b- a 2 ) {l + iV3)E 



3 • 2 2 / 3 E 



6-2V3 



3%/3 ^27c 2 - I8abc + Aa 3 c + 46 3 - a 2 b 2 - 27c + 9ab - 2a 3 



1/3 



(35) 

(36) 

(37) 
(38) 



where a = r— 1— pr — p(l — pr) — A, b = X(p + 1) — p(l — r) and c = —\p. Inserting the above formula into Eq. 13U|) 
we obtain an integral, which can be computed numerically for arbitrary r,pt,p. In Fig. [3|we show examples of this 
computation for different values of the index v. In the same figure we compare the analytic results with those obtained 
by the Monte Carlo generation and numerical diagonalization of random matrices for N — 40, T = 400. As before, 
the agreement between the analytic and Monte-Carlo results is perfect. We see that the effect on the spectrum of 
introducing heavy tails increases with decreasing v. When v is decreasing from infinity to zero the two disjoint islands 
of the distribution develop a bridge to eventually end up as a distribution having only one connected component. 



VII. SUMMARY 



In the paper we have developed a method for computing spectral densities of empirical covariance matrices for a 
wide class of "quasi- Wishart" ensembles with radial probability measures. In particular we have applied this method 
to determine the spectral density of the empirical covariance matrix for heavy tailed data described by a Student 
multivariate distribution. We have shown that the spectrum p(A) decays like \^ lJ / 2 ^ 1 where v is the index of Student 
distribution. The case of v = 3 is of particular importance since it can be used in modeling stock markets. The 
eigenvalue density spreads over the whole positive semi-axis in contrast to the Wishart spectrum which has a finite 
support. 



We have also derived a general formula for the eigenvalue spectrum of the empirical covariance matrix for radial 
ensembles. The spectrum is given by a one-dimensional integral, which can be easily computed numerically. The 
method works also in the case of correlated assets. 
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